This report analyzes data from the Protein Data Bank (PDB). The data describe ligands. Among other things, the dataset contains the name of each chemical molecule, its atom and electron counts, and further columns derived from a three-dimensional fragment of the structure's electron density. Columns built from dictionary values were excluded from the analysis. Because of problems with the computing environment, the initial dataset was limited to 400,000 rows.
library(EDAWR)
library(dplyr)
library(DT)
library(ggplot2)
library(plotly)
library(reshape2)
library(cowplot)
library(data.table)
library(qwraps2)
library(fastDummies)
library(caret)
library(kableExtra)
library(pROC)
set.seed(123)
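# Read a 100-row sample to infer the column classes
initial <- fread("all_summary.csv", nrows = 100)
colClass <- sapply(initial, class)
# Load the first 400,000 rows, reusing the inferred classes
pdb_table <- fread("all_summary.csv", nrows = 400000, colClasses = colClass)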
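# Replace NA with the column mean; only numeric columns are touched (res_name is character)
for (i in which(sapply(pdb_clear_res_name_table, is.numeric))) {
  column_values <- pdb_clear_res_name_table[[i]]
  pdb_clear_res_name_table[[i]][is.na(column_values)] <- mean(column_values, na.rm = TRUE)
}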
Rows whose res_name is one of the following values were removed from the dataset: UNK, UNX, UNL, DUM, N, BLOB, ALA, ARG, ASN, ASP, CYS, GLN, GLU, GLY, HIS, ILE, LEU, LYS, MET, MSE, PHE, PRO, SEC, SER, THR, TRP, TYR, VAL, DA, DG, DT, DC, DU, A, G, T, C, U, HOH, H20, WAT. The dataset was restricted to the columns described on the project page. Columns not used for classification were dropped, except for res_name, local_res_atom_non_h_count, local_res_atom_non_h_electron_sum, dict_atom_non_h_count, and dict_atom_non_h_electron_sum. NA values were replaced with the mean of the given column.
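The two cleaning steps described above (dropping rows by res_name, then mean imputation) can be sketched on a small example. This is a minimal base-R illustration, not the report's own code: the data frame `toy` and its values are invented, and `excluded` holds only a subset of the full list.

```r
# Toy stand-in for the ligand table; column names mirror the report,
# the values are invented for illustration only.
toy <- data.frame(
  res_name = c("SO4", "HOH", "GOL", "ALA", "ZN"),
  local_volume = c(350.5, 120.0, NA, 410.2, 95.1),
  local_electrons = c(12.4, NA, 8.8, 20.1, 3.3),
  stringsAsFactors = FALSE
)

# Subset of the excluded res_name values listed in the report.
excluded <- c("UNK", "UNX", "UNL", "DUM", "N", "BLOB", "ALA", "HOH", "H20", "WAT")

# Step 1: drop rows whose res_name is on the exclusion list.
kept <- toy[!(toy$res_name %in% excluded), ]

# Step 2: replace NA in every numeric column with that column's mean.
numeric_cols <- names(kept)[sapply(kept, is.numeric)]
for (col in numeric_cols) {
  values <- kept[[col]]
  kept[[col]][is.na(values)] <- mean(values, na.rm = TRUE)
}

print(dim(kept))
print(kept)
```

Note that the mean is computed after the filtering step, so excluded rows do not influence the imputed values. Whether the report imputes before or after filtering is not stated, so this ordering is an assumption.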
Before cleaning, the dataset had 400000 rows and 412 columns. After cleaning, it has 393259 rows and 336 columns.
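The code that produced the summary tables in this report is not shown. As a hedged sketch, per-column statistics of the same shape (mean, sd, median, min, max, class) could be computed in base R as follows. The helper `summarise_numeric` and the data frame `toy` are hypothetical names introduced for illustration.

```r
# Hypothetical helper: one row per numeric column, with the statistics
# mean, sd, median, min, max and the column class.
summarise_numeric <- function(df) {
  numeric_cols <- names(df)[sapply(df, is.numeric)]
  rows <- lapply(numeric_cols, function(col) {
    x <- df[[col]]
    data.frame(
      variable = col,
      mean = mean(x, na.rm = TRUE),
      sd = sd(x, na.rm = TRUE),
      median = median(x, na.rm = TRUE),
      min = min(x, na.rm = TRUE),
      max = max(x, na.rm = TRUE),
      class = class(x)[1],
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# Invented example data; only the column names follow the report.
toy <- data.frame(
  res_name = c("SO4", "GOL", "ZN", "NAG"),
  local_volume = c(350, 220, 95, 415),
  local_electrons = c(12, 9, 3, 20),
  stringsAsFactors = FALSE
)

stats <- summarise_numeric(toy)
print(dim(toy))  # number of rows, number of columns
print(stats)
```

Character columns such as res_name are skipped by the helper, which is why the report summarizes res_name separately by length, class, and mode.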
| res_name |
|---|
| Length: 393259 |
| Class: character |
| Mode: character |
| variable | mean | sd | median | min | max | class |
|---|---|---|---|---|---|---|
| local_res_atom_non_h_count | 1.353000e+01 | 1.514000e+01 | 6.000000e+00 | 1.00 | 1.060000e+02 | numeric |
| local_res_atom_non_h_electron_sum | 1.003800e+02 | 1.019700e+02 | 4.800000e+01 | 3.00 | 1.848000e+03 | numeric |
| dict_atom_non_h_count | 1.387000e+01 | 1.561000e+01 | 6.000000e+00 | 1.00 | 1.260000e+02 | numeric |
| dict_atom_non_h_electron_sum | 1.029000e+02 | 1.045600e+02 | 5.200000e+01 | 3.00 | 1.223000e+03 | numeric |
| local_volume | 8.564600e+02 | 1.467780e+03 | 3.438200e+02 | 49.25 | 9.095251e+04 | numeric |
| local_electrons | 1.769000e+01 | 2.526000e+01 | 7.770000e+00 | 0.01 | 4.424400e+02 | numeric |
| local_mean | 2.000000e-02 | 2.000000e-02 | 2.000000e-02 | 0.00 | 3.700000e-01 | numeric |
| local_std | 1.200000e-01 | 1.000000e-01 | 1.000000e-01 | 0.00 | 1.960000e+00 | numeric |
| local_min | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.00 | 0.000000e+00 | numeric |
| local_max | 1.350000e+00 | 1.590000e+00 | 8.900000e-01 | 0.03 | 4.463000e+01 | numeric |
| local_skewness | 2.200000e-01 | 1.900000e-01 | 1.700000e-01 | 0.01 | 4.040000e+00 | numeric |
| part_00_shape_segments_count | 3.388500e+02 | 1.110370e+03 | 2.500000e+01 | 0.00 | 1.145770e+05 | numeric |
| part_00_density_segments_count | 3.388500e+02 | 1.110370e+03 | 2.500000e+01 | 0.00 | 1.145770e+05 | numeric |
| part_00_volume | 3.287000e+01 | 5.051000e+01 | 1.422000e+01 | 0.00 | 2.427940e+03 | numeric |
| part_00_electrons | 1.749000e+01 | 2.511000e+01 | 7.710000e+00 | 0.00 | 4.411400e+02 | numeric |
| part_00_mean | 6.000000e-01 | 4.000000e-01 | 5.100000e-01 | 0.00 | 8.600000e+00 | numeric |
| part_00_std | 2.100000e-01 | 3.000000e-01 | 1.200000e-01 | 0.00 | 8.010000e+00 | numeric |
| part_00_max | 1.350000e+00 | 1.590000e+00 | 8.900000e-01 | 0.00 | 4.463000e+01 | numeric |
| part_00_max_over_std | 9.740000e+00 | 7.600000e+00 | 7.220000e+00 | 0.00 | 1.732500e+02 | numeric |
| part_00_skewness | 2.100000e-01 | 3.400000e-01 | 1.100000e-01 | 0.00 | 1.051000e+01 | numeric |
| part_00_parts | 1.070000e+00 | 3.800000e-01 | 1.000000e+00 | 0.00 | 2.800000e+01 | numeric |
| part_00_shape_O3 | 1.695558e+06 | 9.850960e+06 | 9.601052e+04 | 121.39 | 2.293302e+09 | numeric |
| part_00_shape_O4 | 1.116615e+13 | 9.992375e+14 | 2.274156e+09 | 3807.05 | 4.027551e+17 | numeric |
| part_00_shape_O5 | 1.916881e+20 | 5.147942e+22 | 1.481914e+13 | 33777.03 | 2.908412e+25 | numeric |
| part_00_shape_FL | 6.023041e+16 | 1.439510e+19 | 3.770154e+10 | 75.70 | 5.852600e+21 | numeric |
| part_00_shape_O3_norm | 4.900000e-01 | 3.300000e-01 | 3.800000e-01 | 0.23 | 3.965000e+01 | numeric |
| part_00_shape_O4_norm | 6.000000e-02 | 8.000000e-02 | 3.000000e-02 | 0.02 | 6.010000e+00 | numeric |
| part_00_shape_O5_norm | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.00 | 4.100000e-01 | numeric |
| part_00_shape_FL_norm | 6.000000e-02 | 5.900000e-01 | 1.000000e-02 | 0.00 | 1.893500e+02 | numeric |
| part_00_shape_I1 | 3.252631e+09 | 3.464861e+11 | 7.712680e+06 | 528.82 | 1.633222e+14 | numeric |
| part_00_shape_I2 | 2.919856e+20 | 7.877431e+22 | 8.701642e+12 | 56939.71 | 3.482558e+25 | numeric |
| part_00_shape_I3 | 1.178734e+23 | 4.974270e+25 | 2.430589e+13 | 121692.16 | 2.665214e+28 | numeric |
| part_00_shape_I4 | 3.480335e+16 | 7.655498e+18 | 2.063800e+10 | 42.47 | 2.867703e+21 | numeric |
| part_00_shape_I5 | 1.785198e+16 | 3.389322e+18 | 6.199766e+09 | 6.19 | 1.540703e+21 | numeric |
| part_00_shape_I6 | 2.049683e+18 | 7.101063e+20 | 3.367402e+11 | 28626.18 | 3.743294e+23 | numeric |
| part_00_shape_I1_norm | 5.700000e-01 | 6.750000e+00 | 2.300000e-01 | 0.06 | 2.760570e+03 | numeric |
| part_00_shape_I2_norm | 9.000000e-02 | 1.100000e+00 | 1.000000e-02 | 0.00 | 3.039100e+02 | numeric |
| part_00_shape_I3_norm | 4.553000e+01 | 1.461936e+04 | 3.000000e-02 | 0.00 | 7.617375e+06 | numeric |
| part_00_shape_I4_norm | 4.000000e-02 | 5.800000e-01 | 0.000000e+00 | 0.00 | 1.890700e+02 | numeric |
| part_00_shape_I5_norm | 3.000000e-02 | 5.800000e-01 | 0.000000e+00 | 0.00 | 1.888900e+02 | numeric |
| part_00_shape_I6_norm | 1.120000e+00 | 2.293800e+02 | 4.000000e-02 | 0.00 | 1.094147e+05 | numeric |
| part_00_shape_M000 | 4.109220e+03 | 6.313310e+03 | 1.778000e+03 | 38.00 | 3.034930e+05 | numeric |
| part_00_shape_CI | 4.000000e-02 | 4.120000e+00 | 0.000000e+00 | -129.45 | 7.004000e+01 | numeric |
| part_00_shape_E3_E1 | 2.400000e-01 | 2.000000e-01 | 1.700000e-01 | 0.00 | 9.900000e-01 | numeric |
| part_00_shape_E2_E1 | 4.200000e-01 | 2.400000e-01 | 3.800000e-01 | 0.00 | 1.000000e+00 | numeric |
| part_00_shape_E3_E2 | 5.500000e-01 | 2.300000e-01 | 5.700000e-01 | 0.01 | 1.000000e+00 | numeric |
| part_00_shape_sqrt_E1 | 8.030000e+00 | 5.950000e+00 | 5.870000e+00 | 1.24 | 2.027600e+02 | numeric |
| part_00_shape_sqrt_E2 | 4.420000e+00 | 2.730000e+00 | 3.510000e+00 | 0.74 | 3.452000e+01 | numeric |
| part_00_shape_sqrt_E3 | 2.940000e+00 | 1.420000e+00 | 2.580000e+00 | 0.60 | 1.993000e+01 | numeric |
| part_00_density_O3 | 7.961914e+05 | 2.894272e+06 | 4.686123e+04 | 9.71 | 4.315815e+08 | numeric |
| part_00_density_O4 | 1.360434e+12 | 3.255212e+13 | 5.451315e+08 | 24.38 | 1.207528e+16 | numeric |
| part_00_density_O5 | 1.589070e+18 | 1.517293e+20 | 1.722258e+12 | 17.32 | 5.920612e+22 | numeric |
| part_00_density_FL | 3.306074e+15 | 6.223221e+17 | 8.229923e+09 | -3.01 | 3.058302e+20 | numeric |
| part_00_density_O3_norm | 7.500000e-01 | 1.070000e+00 | 6.100000e-01 | 0.04 | 4.123300e+02 | numeric |
| part_00_density_O4_norm | 1.500000e-01 | 2.200000e-01 | 9.000000e-02 | 0.00 | 3.293000e+01 | numeric |
| part_00_density_O5_norm | 1.000000e-02 | 2.000000e-02 | 0.000000e+00 | 0.00 | 4.430000e+00 | numeric |
| part_00_density_FL_norm | 3.800000e-01 | 4.728000e+01 | 2.000000e-02 | -0.03 | 2.927105e+04 | numeric |
| part_00_density_I1 | 1.058597e+09 | 3.301963e+10 | 3.414976e+06 | 42.22 | 1.282083e+13 | numeric |
| part_00_density_I2 | 1.219865e+19 | 2.907814e+21 | 1.725746e+12 | 363.10 | 1.380658e+24 | numeric |
| part_00_density_I3 | 1.003384e+21 | 3.220425e+23 | 4.769437e+12 | 775.23 | 1.642551e+26 | numeric |
| part_00_density_I4 | 1.996255e+15 | 3.413131e+17 | 4.943933e+09 | -1.01 | 1.713566e+20 | numeric |
| part_00_density_I5 | 1.123042e+15 | 1.563157e+17 | 1.945365e+09 | 0.18 | 8.170753e+19 | numeric |
| part_00_density_I6 | 3.604626e+16 | 7.467917e+18 | 7.076168e+10 | 182.74 | 2.905409e+21 | numeric |
| part_00_density_I1_norm | 2.940000e+00 | 5.121100e+02 | 5.800000e-01 | 0.00 | 2.985910e+05 | numeric |
| part_00_density_I2_norm | 7.900000e-01 | 2.053000e+01 | 5.000000e-02 | 0.00 | 7.616410e+03 | numeric |
| part_00_density_I3_norm | 2.621333e+05 | 1.425401e+08 | 1.500000e-01 | 0.00 | 8.911718e+10 | numeric |
| part_00_density_I4_norm | 3.100000e-01 | 4.721000e+01 | 1.000000e-02 | -0.01 | 2.923185e+04 | numeric |
| part_00_density_I5_norm | 2.700000e-01 | 4.716000e+01 | 0.000000e+00 | 0.00 | 2.920571e+04 | numeric |
| part_00_density_I6_norm | 4.339200e+02 | 1.991169e+05 | 1.700000e-01 | 0.00 | 1.230801e+08 | numeric |
| part_00_density_M000 | 2.186050e+03 | 3.139180e+03 | 9.634300e+02 | 3.05 | 5.514218e+04 | numeric |
| part_00_density_CI | 4.000000e-02 | 4.700000e+00 | 0.000000e+00 | -155.70 | 8.996000e+01 | numeric |
| part_00_density_E3_E1 | 2.500000e-01 | 2.000000e-01 | 1.700000e-01 | 0.00 | 1.000000e+00 | numeric |
| part_00_density_E2_E1 | 4.200000e-01 | 2.500000e-01 | 3.800000e-01 | 0.00 | 1.000000e+00 | numeric |
| part_00_density_E3_E2 | 5.600000e-01 | 2.300000e-01 | 5.800000e-01 | 0.01 | 1.000000e+00 | numeric |
| part_00_density_sqrt_E1 | 7.720000e+00 | 5.840000e+00 | 5.550000e+00 | 1.24 | 2.024800e+02 | numeric |
| part_00_density_sqrt_E2 | 4.200000e+00 | 2.630000e+00 | 3.270000e+00 | 0.74 | 3.280000e+01 | numeric |
| part_00_density_sqrt_E3 | 2.780000e+00 | 1.340000e+00 | 2.430000e+00 | 0.60 | 1.938000e+01 | numeric |
| part_00_shape_Z_7_3 | 4.093000e+01 | 3.645000e+01 | 2.658000e+01 | 6.30 | 5.587100e+02 | numeric |
| part_00_shape_Z_0_0 | 2.615000e+01 | 1.723000e+01 | 2.060000e+01 | 3.01 | 2.691700e+02 | numeric |
| part_00_shape_Z_7_0 | 1.735000e+01 | 1.671000e+01 | 1.015000e+01 | 0.85 | 3.669900e+02 | numeric |
| part_00_shape_Z_7_1 | 2.813000e+01 | 2.593000e+01 | 1.757000e+01 | 3.66 | 4.461400e+02 | numeric |
| part_00_shape_Z_3_0 | 1.505000e+01 | 1.296000e+01 | 1.066000e+01 | 0.50 | 2.081300e+02 | numeric |
| part_00_shape_Z_5_2 | 3.493000e+01 | 2.908000e+01 | 2.470000e+01 | 4.58 | 4.551000e+02 | numeric |
| part_00_shape_Z_6_1 | 3.166000e+01 | 2.848000e+01 | 2.085000e+01 | 1.81 | 4.762100e+02 | numeric |
| part_00_shape_Z_3_1 | 2.437000e+01 | 1.902000e+01 | 1.807000e+01 | 2.51 | 2.972800e+02 | numeric |
| part_00_shape_Z_6_0 | 1.479000e+01 | 1.402000e+01 | 9.890000e+00 | 0.02 | 2.990100e+02 | numeric |
| part_00_shape_Z_2_1 | 3.825000e+01 | 2.770000e+01 | 2.870000e+01 | 2.75 | 4.208100e+02 | numeric |
| part_00_shape_Z_6_3 | 4.648000e+01 | 4.064000e+01 | 3.112000e+01 | 4.11 | 6.084300e+02 | numeric |
| part_00_shape_Z_2_0 | 2.809000e+01 | 1.984000e+01 | 2.179000e+01 | 1.32 | 3.265300e+02 | numeric |
| part_00_shape_Z_6_2 | 4.197000e+01 | 3.728000e+01 | 2.791000e+01 | 2.94 | 5.622000e+02 | numeric |
| part_00_shape_Z_5_0 | 1.827000e+01 | 1.712000e+01 | 1.218000e+01 | 0.88 | 3.155700e+02 | numeric |
| part_00_shape_Z_5_1 | 2.887000e+01 | 2.459000e+01 | 2.043000e+01 | 3.46 | 4.074900e+02 | numeric |
| part_00_shape_Z_4_2 | 4.374000e+01 | 3.569000e+01 | 3.133000e+01 | 3.58 | 5.344700e+02 | numeric |
| part_00_shape_Z_1_0 | 1.430000e+00 | 2.100000e-01 | 1.410000e+00 | 0.74 | 2.400000e+00 | numeric |
| part_00_shape_Z_4_1 | 3.763000e+01 | 3.120000e+01 | 2.698000e+01 | 1.95 | 4.656000e+02 | numeric |
| part_00_shape_Z_7_2 | 3.638000e+01 | 3.305000e+01 | 2.330000e+01 | 5.96 | 5.303200e+02 | numeric |
| part_00_shape_Z_4_0 | 2.046000e+01 | 1.753000e+01 | 1.485000e+01 | 0.03 | 3.130900e+02 | numeric |
| part_00_density_Z_7_3 | 3.030000e+01 | 2.754000e+01 | 1.942000e+01 | 2.89 | 2.059200e+02 | numeric |
| part_00_density_Z_0_0 | 1.904000e+01 | 1.262000e+01 | 1.517000e+01 | 0.85 | 1.147400e+02 | numeric |
| part_00_density_Z_7_0 | 1.486000e+01 | 1.374000e+01 | 8.520000e+00 | 0.98 | 1.269200e+02 | numeric |
| part_00_density_Z_7_1 | 2.230000e+01 | 2.042000e+01 | 1.391000e+01 | 2.88 | 1.599200e+02 | numeric |
| part_00_density_Z_3_0 | 1.129000e+01 | 9.710000e+00 | 7.760000e+00 | 0.42 | 8.854000e+01 | numeric |
| part_00_density_Z_5_2 | 2.552000e+01 | 2.161000e+01 | 1.772000e+01 | 2.15 | 1.821800e+02 | numeric |
| part_00_density_Z_6_1 | 2.469000e+01 | 2.254000e+01 | 1.713000e+01 | 0.51 | 1.746700e+02 | numeric |
| part_00_density_Z_3_1 | 1.729000e+01 | 1.388000e+01 | 1.244000e+01 | 1.44 | 1.208600e+02 | numeric |
| part_00_density_Z_6_0 | 1.253000e+01 | 1.254000e+01 | 8.040000e+00 | 0.01 | 1.224900e+02 | numeric |
| part_00_density_Z_2_1 | 2.801000e+01 | 2.003000e+01 | 2.147000e+01 | 0.91 | 1.762500e+02 | numeric |
| part_00_density_Z_6_3 | 3.422000e+01 | 3.068000e+01 | 2.352000e+01 | 1.16 | 2.600800e+02 | numeric |
| part_00_density_Z_2_0 | 2.156000e+01 | 1.486000e+01 | 1.696000e+01 | 0.51 | 1.352800e+02 | numeric |
| part_00_density_Z_6_2 | 3.148000e+01 | 2.846000e+01 | 2.167000e+01 | 0.85 | 2.355200e+02 | numeric |
| part_00_density_Z_5_0 | 1.489000e+01 | 1.344000e+01 | 9.730000e+00 | 0.87 | 1.182300e+02 | numeric |
| part_00_density_Z_5_1 | 2.185000e+01 | 1.860000e+01 | 1.519000e+01 | 2.14 | 1.670800e+02 | numeric |
| part_00_density_Z_4_2 | 3.225000e+01 | 2.614000e+01 | 2.349000e+01 | 1.01 | 2.284400e+02 | numeric |
| part_00_density_Z_1_0 | 1.420000e+00 | 2.200000e-01 | 1.390000e+00 | 0.68 | 2.400000e+00 | numeric |
| part_00_density_Z_4_1 | 2.851000e+01 | 2.309000e+01 | 2.093000e+01 | 0.76 | 1.810600e+02 | numeric |
| part_00_density_Z_7_2 | 2.760000e+01 | 2.528000e+01 | 1.751000e+01 | 2.89 | 1.951200e+02 | numeric |
| part_00_density_Z_4_0 | 1.700000e+01 | 1.404000e+01 | 1.273000e+01 | 0.01 | 1.196500e+02 | numeric |
| part_01_shape_segments_count | 2.830100e+02 | 9.662200e+02 | 1.500000e+01 | 0.00 | 6.920200e+04 | numeric |
| part_01_density_segments_count | 2.830100e+02 | 9.662200e+02 | 1.500000e+01 | 0.00 | 6.920200e+04 | numeric |
| part_01_volume | 2.525000e+01 | 4.106000e+01 | 1.031000e+01 | 0.00 | 1.996250e+03 | numeric |
| part_01_electrons | 1.504000e+01 | 2.287000e+01 | 6.120000e+00 | 0.00 | 3.957000e+02 | numeric |
| part_01_mean | 6.500000e-01 | 4.300000e-01 | 5.600000e-01 | 0.00 | 8.860000e+00 | numeric |
| part_01_std | 2.000000e-01 | 3.000000e-01 | 1.100000e-01 | 0.00 | 8.080000e+00 | numeric |
| part_01_max | 1.350000e+00 | 1.590000e+00 | 8.900000e-01 | 0.00 | 4.463000e+01 | numeric |
| part_01_max_over_std | 9.710000e+00 | 7.640000e+00 | 7.220000e+00 | 0.00 | 1.732500e+02 | numeric |
| part_01_skewness | 2.000000e-01 | 3.400000e-01 | 1.000000e-01 | 0.00 | 1.077000e+01 | numeric |
| part_01_parts | 1.270000e+00 | 7.000000e-01 | 1.000000e+00 | 0.00 | 2.400000e+01 | numeric |
| part_01_shape_O3 | 1.276332e+06 | 7.453268e+06 | 5.937043e+04 | 74.84 | 1.837172e+09 | numeric |
| part_01_shape_O4 | 5.888376e+12 | 4.679577e+14 | 8.720523e+08 | 1818.62 | 1.680769e+17 | numeric |
| part_01_shape_O5 | 5.793748e+19 | 1.314826e+22 | 3.512181e+12 | 13532.12 | 5.890566e+24 | numeric |
| part_01_shape_FL | 3.084952e+16 | 7.231544e+18 | 1.046845e+10 | 0.00 | 2.872585e+21 | numeric |
| part_01_shape_O3_norm | 5.300000e-01 | 4.300000e-01 | 3.700000e-01 | 0.23 | 4.375000e+01 | numeric |
| part_01_shape_O4_norm | 7.000000e-02 | 1.100000e-01 | 3.000000e-02 | 0.02 | 1.175000e+01 | numeric |
| part_01_shape_O5_norm | 0.000000e+00 | 1.000000e-02 | 0.000000e+00 | 0.00 | 1.230000e+00 | numeric |
| part_01_shape_FL_norm | 1.400000e-01 | 2.230000e+00 | 0.000000e+00 | 0.00 | 5.526100e+02 | numeric |
| part_01_shape_I1 | 2.410654e+09 | 2.754567e+11 | 3.905652e+06 | 210.50 | 1.307399e+14 | numeric |
| part_01_shape_I2 | 1.474496e+20 | 4.067568e+22 | 2.238828e+12 | 10919.68 | 1.923967e+25 | numeric |
| part_01_shape_I3 | 7.544764e+22 | 3.200528e+25 | 6.116160e+12 | 9454.58 | 1.708172e+28 | numeric |
| part_01_shape_I4 | 1.783789e+16 | 3.751691e+18 | 5.856015e+09 | 0.00 | 1.407712e+21 | numeric |
| part_01_shape_I5 | 9.163466e+15 | 1.520986e+18 | 1.619440e+09 | 0.00 | 4.931548e+20 | numeric |
| part_01_shape_I6 | 1.273856e+18 | 4.557894e+20 | 1.045825e+11 | 5359.46 | 2.400810e+23 | numeric |
| part_01_shape_I1_norm | 7.700000e-01 | 8.740000e+00 | 2.200000e-01 | 0.06 | 3.426140e+03 | numeric |
| part_01_shape_I2_norm | 2.400000e-01 | 1.197000e+01 | 1.000000e-02 | 0.00 | 6.729610e+03 | numeric |
| part_01_shape_I3_norm | 7.693000e+01 | 2.330541e+04 | 2.000000e-02 | 0.00 | 1.173424e+07 | numeric |
| part_01_shape_I4_norm | 1.100000e-01 | 2.270000e+00 | 0.000000e+00 | 0.00 | 6.056100e+02 | numeric |
| part_01_shape_I5_norm | 1.000000e-01 | 2.310000e+00 | 0.000000e+00 | 0.00 | 6.409500e+02 | numeric |
| part_01_shape_I6_norm | 1.790000e+00 | 3.274700e+02 | 4.000000e-02 | 0.00 | 1.498618e+05 | numeric |
| part_01_shape_M000 | 3.186450e+03 | 5.123700e+03 | 1.328000e+03 | 32.00 | 2.495310e+05 | numeric |
| part_01_shape_CI | 3.000000e-02 | 4.530000e+00 | 0.000000e+00 | -142.64 | 7.127000e+01 | numeric |
| part_01_shape_E3_E1 | 2.500000e-01 | 2.100000e-01 | 1.800000e-01 | 0.00 | 9.900000e-01 | numeric |
| part_01_shape_E2_E1 | 4.300000e-01 | 2.500000e-01 | 4.000000e-01 | 0.00 | 1.000000e+00 | numeric |
| part_01_shape_E3_E2 | 5.700000e-01 | 2.300000e-01 | 5.900000e-01 | 0.01 | 1.000000e+00 | numeric |
| part_01_shape_sqrt_E1 | 7.450000e+00 | 5.990000e+00 | 5.270000e+00 | 0.93 | 2.023700e+02 | numeric |
| part_01_shape_sqrt_E2 | 4.030000e+00 | 2.730000e+00 | 3.160000e+00 | 0.53 | 3.206000e+01 | numeric |
| part_01_shape_sqrt_E3 | 2.660000e+00 | 1.410000e+00 | 2.350000e+00 | 0.37 | 1.906000e+01 | numeric |
| part_01_density_O3 | 6.719290e+05 | 2.504688e+06 | 3.307433e+04 | 5.07 | 3.763261e+08 | numeric |
| part_01_density_O4 | 9.962197e+11 | 2.296292e+13 | 2.639872e+08 | 6.52 | 8.774435e+15 | numeric |
| part_01_density_O5 | 9.393053e+17 | 8.694761e+19 | 5.760985e+11 | 1.74 | 3.619558e+22 | numeric |
| part_01_density_FL | 2.372681e+15 | 4.490618e+17 | 2.842058e+09 | -11.69 | 2.206107e+20 | numeric |
| part_01_density_O3_norm | 7.700000e-01 | 1.090000e+00 | 5.700000e-01 | 0.04 | 3.097900e+02 | numeric |
| part_01_density_O4_norm | 1.600000e-01 | 3.000000e-01 | 8.000000e-02 | 0.00 | 3.095000e+01 | numeric |
| part_01_density_O5_norm | 1.000000e-02 | 3.000000e-02 | 0.000000e+00 | 0.00 | 3.740000e+00 | numeric |
| part_01_density_FL_norm | 7.900000e-01 | 2.885000e+01 | 1.000000e-02 | -0.04 | 1.474174e+04 | numeric |
| part_01_density_I1 | 8.851343e+08 | 2.810244e+10 | 2.011426e+06 | 23.86 | 1.098793e+13 | numeric |
| part_01_density_I2 | 8.447497e+18 | 2.080879e+21 | 5.998141e+11 | 81.51 | 9.956763e+23 | numeric |
| part_01_density_I3 | 7.378980e+20 | 2.363363e+23 | 1.584537e+12 | 186.52 | 1.206633e+26 | numeric |
| part_01_density_I4 | 1.469681e+15 | 2.535361e+17 | 1.703236e+09 | -3.02 | 1.278331e+20 | numeric |
| part_01_density_I5 | 8.676808e+14 | 1.250201e+17 | 6.184899e+08 | 0.00 | 6.598137e+19 | numeric |
| part_01_density_I6 | 2.664503e+16 | 5.527029e+18 | 2.976817e+10 | 63.49 | 2.140061e+21 | numeric |
| part_01_density_I1_norm | 3.090000e+00 | 3.784700e+02 | 5.100000e-01 | 0.00 | 1.711294e+05 | numeric |
| part_01_density_I2_norm | 2.250000e+00 | 1.563800e+02 | 4.000000e-02 | 0.00 | 7.025807e+04 | numeric |
| part_01_density_I3_norm | 1.445226e+05 | 5.385298e+07 | 1.100000e-01 | 0.00 | 2.926873e+10 | numeric |
| part_01_density_I4_norm | 7.200000e-01 | 2.927000e+01 | 1.000000e-02 | -0.02 | 1.478973e+04 | numeric |
| part_01_density_I5_norm | 6.700000e-01 | 2.965000e+01 | 0.000000e+00 | 0.00 | 1.482173e+04 | numeric |
| part_01_density_I6_norm | 3.171300e+02 | 1.042647e+05 | 1.300000e-01 | 0.00 | 5.299373e+07 | numeric |
| part_01_density_M000 | 1.897890e+03 | 2.852510e+03 | 7.915200e+02 | 1.44 | 4.946192e+04 | numeric |
| part_01_density_CI | 4.000000e-02 | 5.090000e+00 | 0.000000e+00 | -162.58 | 1.049000e+02 | numeric |
| part_01_density_E3_E1 | 2.600000e-01 | 2.100000e-01 | 1.800000e-01 | 0.00 | 1.000000e+00 | numeric |
| part_01_density_E2_E1 | 4.300000e-01 | 2.500000e-01 | 4.000000e-01 | 0.00 | 1.000000e+00 | numeric |
| part_01_density_E3_E2 | 5.700000e-01 | 2.400000e-01 | 5.900000e-01 | 0.01 | 1.000000e+00 | numeric |
| part_01_density_sqrt_E1 | 7.190000e+00 | 5.870000e+00 | 4.960000e+00 | 0.93 | 2.021700e+02 | numeric |
| part_01_density_sqrt_E2 | 3.840000e+00 | 2.640000e+00 | 2.970000e+00 | 0.53 | 3.077000e+01 | numeric |
| part_01_density_sqrt_E3 | 2.530000e+00 | 1.340000e+00 | 2.230000e+00 | 0.37 | 1.869000e+01 | numeric |
| part_01_shape_Z_7_3 | 3.578000e+01 | 3.373000e+01 | 2.195000e+01 | 4.61 | 4.702800e+02 | numeric |
| part_01_shape_Z_0_0 | 2.243000e+01 | 1.598000e+01 | 1.781000e+01 | 2.76 | 2.440700e+02 | numeric |
| part_01_shape_Z_7_0 | 1.597000e+01 | 1.540000e+01 | 8.700000e+00 | 0.71 | 2.915600e+02 | numeric |
| part_01_shape_Z_7_1 | 2.494000e+01 | 2.397000e+01 | 1.439000e+01 | 3.42 | 3.709900e+02 | numeric |
| part_01_shape_Z_3_0 | 1.341000e+01 | 1.216000e+01 | 9.020000e+00 | 0.63 | 1.917600e+02 | numeric |
| part_01_shape_Z_5_2 | 3.023000e+01 | 2.696000e+01 | 2.051000e+01 | 3.10 | 4.033000e+02 | numeric |
| part_01_shape_Z_6_1 | 2.730000e+01 | 2.665000e+01 | 1.695000e+01 | 0.79 | 3.717000e+02 | numeric |
| part_01_shape_Z_3_1 | 2.127000e+01 | 1.774000e+01 | 1.547000e+01 | 2.42 | 2.606700e+02 | numeric |
| part_01_shape_Z_6_0 | 1.306000e+01 | 1.346000e+01 | 8.210000e+00 | 0.01 | 2.636200e+02 | numeric |
| part_01_shape_Z_2_1 | 3.264000e+01 | 2.569000e+01 | 2.422000e+01 | 1.71 | 3.463800e+02 | numeric |
| part_01_shape_Z_6_3 | 4.009000e+01 | 3.784000e+01 | 2.567000e+01 | 3.40 | 5.043900e+02 | numeric |
| part_01_shape_Z_2_0 | 2.378000e+01 | 1.850000e+01 | 1.816000e+01 | 0.05 | 2.846000e+02 | numeric |
| part_01_shape_Z_6_2 | 3.602000e+01 | 3.467000e+01 | 2.267000e+01 | 2.46 | 4.666900e+02 | numeric |
| part_01_shape_Z_5_0 | 1.638000e+01 | 1.587000e+01 | 9.780000e+00 | 0.77 | 2.952300e+02 | numeric |
| part_01_shape_Z_5_1 | 2.492000e+01 | 2.270000e+01 | 1.669000e+01 | 2.41 | 3.681300e+02 | numeric |
| part_01_shape_Z_4_2 | 3.730000e+01 | 3.316000e+01 | 2.569000e+01 | 2.22 | 4.224900e+02 | numeric |
| part_01_shape_Z_1_0 | 1.540000e+00 | 3.100000e-01 | 1.500000e+00 | 0.70 | 4.280000e+00 | numeric |
| part_01_shape_Z_4_1 | 3.183000e+01 | 2.898000e+01 | 2.177000e+01 | 1.14 | 3.746300e+02 | numeric |
| part_01_shape_Z_7_2 | 3.171000e+01 | 3.046000e+01 | 1.890000e+01 | 4.02 | 4.472700e+02 | numeric |
| part_01_shape_Z_4_0 | 1.736000e+01 | 1.655000e+01 | 1.171000e+01 | 0.01 | 2.751700e+02 | numeric |
| part_01_density_Z_7_3 | 2.800000e+01 | 2.657000e+01 | 1.657000e+01 | 2.60 | 2.030500e+02 | numeric |
| part_01_density_Z_0_0 | 1.727000e+01 | 1.238000e+01 | 1.375000e+01 | 0.59 | 1.086700e+02 | numeric |
| part_01_density_Z_7_0 | 1.425000e+01 | 1.315000e+01 | 7.810000e+00 | 1.06 | 1.251700e+02 | numeric |
| part_01_density_Z_7_1 | 2.077000e+01 | 1.963000e+01 | 1.162000e+01 | 1.91 | 1.556800e+02 | numeric |
| part_01_density_Z_3_0 | 1.067000e+01 | 9.470000e+00 | 6.910000e+00 | 0.44 | 8.632000e+01 | numeric |
| part_01_density_Z_5_2 | 2.343000e+01 | 2.094000e+01 | 1.539000e+01 | 2.07 | 1.804900e+02 | numeric |
| part_01_density_Z_6_1 | 2.208000e+01 | 2.202000e+01 | 1.397000e+01 | 0.37 | 1.685900e+02 | numeric |
| part_01_density_Z_3_1 | 1.606000e+01 | 1.353000e+01 | 1.119000e+01 | 1.00 | 1.174500e+02 | numeric |
| part_01_density_Z_6_0 | 1.133000e+01 | 1.240000e+01 | 6.270000e+00 | 0.01 | 1.174500e+02 | numeric |
| part_01_density_Z_2_1 | 2.527000e+01 | 1.959000e+01 | 1.919000e+01 | 0.71 | 1.709200e+02 | numeric |
| part_01_density_Z_6_3 | 3.089000e+01 | 2.989000e+01 | 1.986000e+01 | 1.05 | 2.519900e+02 | numeric |
| part_01_density_Z_2_0 | 1.929000e+01 | 1.465000e+01 | 1.493000e+01 | 0.06 | 1.281300e+02 | numeric |
| part_01_density_Z_6_2 | 2.821000e+01 | 2.768000e+01 | 1.804000e+01 | 0.88 | 2.259600e+02 | numeric |
| part_01_density_Z_5_0 | 1.400000e+01 | 1.296000e+01 | 8.280000e+00 | 0.79 | 1.139400e+02 | numeric |
| part_01_density_Z_5_1 | 1.998000e+01 | 1.795000e+01 | 1.294000e+01 | 1.93 | 1.648700e+02 | numeric |
| part_01_density_Z_4_2 | 2.887000e+01 | 2.558000e+01 | 2.038000e+01 | 0.84 | 2.217200e+02 | numeric |
| part_01_density_Z_1_0 | 1.530000e+00 | 3.100000e-01 | 1.490000e+00 | 0.62 | 4.290000e+00 | numeric |
| part_01_density_Z_4_1 | 2.527000e+01 | 2.262000e+01 | 1.793000e+01 | 0.47 | 1.753100e+02 | numeric |
| part_01_density_Z_7_2 | 2.541000e+01 | 2.429000e+01 | 1.469000e+01 | 2.26 | 1.917000e+02 | numeric |
| part_01_density_Z_4_0 | 1.495000e+01 | 1.397000e+01 | 1.057000e+01 | 0.01 | 1.188000e+02 | numeric |
| part_02_shape_segments_count | 2.365800e+02 | 8.451900e+02 | 9.000000e+00 | 0.00 | 4.556400e+04 | numeric |
| part_02_density_segments_count | 2.365800e+02 | 8.451900e+02 | 9.000000e+00 | 0.00 | 4.556400e+04 | numeric |
| part_02_volume | 1.947000e+01 | 3.372000e+01 | 7.350000e+00 | 0.00 | 1.632540e+03 | numeric |
| part_02_electrons | 1.284000e+01 | 2.064000e+01 | 4.730000e+00 | 0.00 | 3.511900e+02 | numeric |
| part_02_mean | 6.800000e-01 | 4.800000e-01 | 6.000000e-01 | 0.00 | 9.760000e+00 | numeric |
| part_02_std | 1.900000e-01 | 3.100000e-01 | 9.000000e-02 | 0.00 | 8.260000e+00 | numeric |
| part_02_max | 1.320000e+00 | 1.600000e+00 | 8.800000e-01 | 0.00 | 4.463000e+01 | numeric |
| part_02_max_over_std | 9.480000e+00 | 7.860000e+00 | 7.220000e+00 | 0.00 | 1.732500e+02 | numeric |
| part_02_skewness | 1.900000e-01 | 3.400000e-01 | 8.000000e-02 | 0.00 | 1.089000e+01 | numeric |
| part_02_parts | 1.300000e+00 | 9.500000e-01 | 1.000000e+00 | 0.00 | 2.600000e+01 | numeric |
| part_02_shape_O3 | 1.014612e+06 | 5.732035e+06 | 4.992546e+04 | 72.00 | 1.487394e+09 | numeric |
| part_02_shape_O4 | 3.440276e+12 | 2.387292e+14 | 6.119596e+08 | 1728.00 | 8.781097e+16 | numeric |
| part_02_shape_O5 | 2.085633e+19 | 4.284804e+21 | 2.073642e+12 | 12288.00 | 2.033831e+24 | numeric |
| part_02_shape_FL | 1.662951e+16 | 3.744737e+18 | 6.558381e+09 | -61.16 | 1.394306e+21 | numeric |
| part_02_shape_O3_norm | 5.800000e-01 | 5.400000e-01 | 3.700000e-01 | 0.22 | 6.888000e+01 | numeric |
| part_02_shape_O4_norm | 9.000000e-02 | 1.700000e-01 | 3.000000e-02 | 0.02 | 2.169000e+01 | numeric |
| part_02_shape_O5_norm | 0.000000e+00 | 2.000000e-02 | 0.000000e+00 | 0.00 | 6.420000e+00 | numeric |
| part_02_shape_FL_norm | 3.600000e-01 | 9.310000e+00 | 0.000000e+00 | 0.00 | 3.374840e+03 | numeric |
| part_02_shape_I1 | 1.885939e+09 | 2.150612e+11 | 3.103373e+06 | 186.00 | 1.057686e+14 | numeric |
| part_02_shape_I2 | 8.098471e+19 | 2.193735e+22 | 1.384962e+12 | 9312.00 | 1.018994e+25 | numeric |
| part_02_shape_I3 | 4.906230e+22 | 2.008981e+25 | 3.838129e+12 | 7092.00 | 1.118111e+28 | numeric |
| part_02_shape_I4 | 9.749810e+15 | 1.969465e+18 | 3.736084e+09 | -21.56 | 7.546862e+20 | numeric |
| part_02_shape_I5 | 5.163344e+15 | 8.306804e+17 | 1.020489e+09 | 0.00 | 3.282730e+20 | numeric |
| part_02_shape_I6 | 8.204262e+17 | 2.860271e+20 | 6.931835e+10 | 4464.00 | 1.572613e+23 | numeric |
| part_02_shape_I1_norm | 1.060000e+00 | 1.552000e+01 | 2.300000e-01 | 0.06 | 8.356550e+03 | numeric |
| part_02_shape_I2_norm | 9.200000e-01 | 1.095600e+02 | 1.000000e-02 | 0.00 | 4.807317e+04 | numeric |
| part_02_shape_I3_norm | 2.581300e+02 | 1.139011e+05 | 3.000000e-02 | 0.00 | 6.981922e+07 | numeric |
| part_02_shape_I4_norm | 3.300000e-01 | 9.590000e+00 | 0.000000e+00 | 0.00 | 3.367000e+03 | numeric |
| part_02_shape_I5_norm | 3.100000e-01 | 9.840000e+00 | 0.000000e+00 | 0.00 | 3.361780e+03 | numeric |
| part_02_shape_I6_norm | 3.780000e+00 | 9.674900e+02 | 4.000000e-02 | 0.00 | 5.755581e+05 | numeric |
| part_02_shape_M000 | 2.613920e+03 | 4.162100e+03 | 1.185000e+03 | 32.00 | 2.040670e+05 | numeric |
| part_02_shape_CI | 3.000000e-02 | 4.710000e+00 | 1.000000e-02 | -153.75 | 1.470400e+02 | numeric |
| part_02_shape_E3_E1 | 2.700000e-01 | 2.100000e-01 | 2.300000e-01 | 0.00 | 1.000000e+00 | numeric |
| part_02_shape_E2_E1 | 4.400000e-01 | 2.500000e-01 | 4.400000e-01 | 0.00 | 1.000000e+00 | numeric |
| part_02_shape_E3_E2 | 5.800000e-01 | 2.300000e-01 | 5.900000e-01 | 0.01 | 1.000000e+00 | numeric |
| part_02_shape_sqrt_E1 | 7.110000e+00 | 5.900000e+00 | 5.110000e+00 | 0.87 | 2.018100e+02 | numeric |
| part_02_shape_sqrt_E2 | 3.790000e+00 | 2.650000e+00 | 3.060000e+00 | 0.57 | 2.965000e+01 | numeric |
| part_02_shape_sqrt_E3 | 2.500000e+00 | 1.350000e+00 | 2.300000e+00 | 0.42 | 1.822000e+01 | numeric |
| part_02_density_O3 | 5.898117e+05 | 2.142951e+06 | 3.142096e+04 | 9.54 | 3.232489e+08 | numeric |
| part_02_density_O4 | 7.561160e+11 | 1.618552e+13 | 2.389454e+08 | 26.47 | 6.115114e+15 | numeric |
| part_02_density_O5 | 5.906779e+17 | 5.133060e+19 | 4.901528e+11 | 19.58 | 2.094861e+22 | numeric |
| part_02_density_FL | 1.765939e+15 | 3.193372e+17 | 2.240732e+09 | -23.33 | 1.536992e+20 | numeric |
| part_02_density_O3_norm | 7.800000e-01 | 1.120000e+00 | 5.500000e-01 | 0.03 | 3.829800e+02 | numeric |
| part_02_density_O4_norm | 1.600000e-01 | 4.400000e-01 | 7.000000e-02 | 0.00 | 1.155500e+02 | numeric |
| part_02_density_O5_norm | 1.000000e-02 | 1.900000e-01 | 0.000000e+00 | 0.00 | 1.130500e+02 | numeric |
| part_02_density_FL_norm | 2.510000e+00 | 3.340100e+02 | 1.000000e-02 | 0.00 | 1.979128e+05 | numeric |
| part_02_density_I1 | 7.687959e+08 | 2.338538e+10 | 1.806941e+06 | 26.90 | 9.365256e+12 | numeric |
| part_02_density_I2 | 6.088224e+18 | 1.457944e+21 | 4.777123e+11 | 184.28 | 6.907748e+23 | numeric |
| part_02_density_I3 | 5.462508e+20 | 1.665822e+23 | 1.252898e+12 | 162.25 | 8.766542e+25 | numeric |
| part_02_density_I4 | 1.131192e+15 | 1.880488e+17 | 1.353432e+09 | -5.77 | 9.375584e+19 | numeric |
| part_02_density_I5 | 7.080263e+14 | 1.020142e+17 | 4.826866e+08 | 0.00 | 5.379362e+19 | numeric |
| part_02_density_I6 | 2.018508e+16 | 3.972588e+18 | 2.538124e+10 | 88.32 | 1.523784e+21 | numeric |
| part_02_density_I1_norm | 3.410000e+00 | 4.507300e+02 | 4.700000e-01 | 0.00 | 2.583039e+05 | numeric |
| part_02_density_I2_norm | 1.456000e+01 | 3.643240e+03 | 3.000000e-02 | 0.00 | 2.154056e+06 | numeric |
| part_02_density_I3_norm | 2.182020e+05 | 1.076679e+08 | 1.000000e-01 | 0.00 | 6.670879e+10 | numeric |
| part_02_density_I4_norm | 2.640000e+00 | 4.366500e+02 | 1.000000e-02 | 0.00 | 2.648333e+05 | numeric |
| part_02_density_I5_norm | 2.720000e+00 | 5.061100e+02 | 0.000000e+00 | 0.00 | 3.094470e+05 | numeric |
| part_02_density_I6_norm | 3.696800e+02 | 1.629474e+05 | 1.200000e-01 | 0.00 | 9.891203e+07 | numeric |
| part_02_density_M000 | 1.724400e+03 | 2.543160e+03 | 7.908900e+02 | 2.54 | 4.389833e+04 | numeric |
| part_02_density_CI | 3.000000e-02 | 5.250000e+00 | 1.000000e-02 | -166.26 | 1.675400e+02 | numeric |
| part_02_density_E3_E1 | 2.700000e-01 | 2.200000e-01 | 2.300000e-01 | 0.00 | 1.000000e+00 | numeric |
| part_02_density_E2_E1 | 4.400000e-01 | 2.500000e-01 | 4.400000e-01 | 0.00 | 1.000000e+00 | numeric |
| part_02_density_E3_E2 | 5.800000e-01 | 2.300000e-01 | 5.900000e-01 | 0.01 | 1.000000e+00 | numeric |
| part_02_density_sqrt_E1 | 6.870000e+00 | 5.780000e+00 | 4.810000e+00 | 0.87 | 2.017100e+02 | numeric |
| part_02_density_sqrt_E2 | 3.630000e+00 | 2.560000e+00 | 2.870000e+00 | 0.57 | 2.875000e+01 | numeric |
| part_02_density_sqrt_E3 | 2.390000e+00 | 1.290000e+00 | 2.190000e+00 | 0.42 | 1.768000e+01 | numeric |
| part_02_shape_Z_7_3 | 3.261000e+01 | 3.051000e+01 | 2.041000e+01 | 5.84 | 4.144200e+02 | numeric |
| part_02_shape_Z_0_0 | 2.004000e+01 | 1.438000e+01 | 1.682000e+01 | 2.76 | 2.207200e+02 | numeric |
| part_02_shape_Z_7_0 | 1.521000e+01 | 1.395000e+01 | 8.970000e+00 | 0.91 | 2.288400e+02 | numeric |
| part_02_shape_Z_7_1 | 2.306000e+01 | 2.168000e+01 | 1.346000e+01 | 3.76 | 3.197400e+02 | numeric |
| part_02_shape_Z_3_0 | 1.239000e+01 | 1.114000e+01 | 8.500000e+00 | 0.66 | 1.951600e+02 | numeric |
| part_02_shape_Z_5_2 | 2.730000e+01 | 2.438000e+01 | 1.902000e+01 | 3.98 | 3.446100e+02 | numeric |
| part_02_shape_Z_6_1 | 2.438000e+01 | 2.434000e+01 | 1.569000e+01 | 0.97 | 3.223400e+02 | numeric |
| part_02_shape_Z_3_1 | 1.930000e+01 | 1.613000e+01 | 1.458000e+01 | 2.62 | 2.619600e+02 | numeric |
| part_02_shape_Z_6_0 | 1.186000e+01 | 1.257000e+01 | 7.690000e+00 | 0.00 | 1.878500e+02 | numeric |
| part_02_shape_Z_2_1 | 2.896000e+01 | 2.320000e+01 | 2.248000e+01 | 1.59 | 3.159100e+02 | numeric |
| part_02_shape_Z_6_3 | 3.586000e+01 | 3.442000e+01 | 2.391000e+01 | 3.18 | 4.571400e+02 | numeric |
| part_02_shape_Z_2_0 | 2.096000e+01 | 1.676000e+01 | 1.673000e+01 | 0.06 | 2.567500e+02 | numeric |
| part_02_shape_Z_6_2 | 3.207000e+01 | 3.148000e+01 | 2.094000e+01 | 2.17 | 4.218400e+02 | numeric |
| part_02_shape_Z_5_0 | 1.525000e+01 | 1.442000e+01 | 9.070000e+00 | 0.88 | 2.418300e+02 | numeric |
| part_02_shape_Z_5_1 | 2.251000e+01 | 2.045000e+01 | 1.537000e+01 | 2.71 | 3.003900e+02 | numeric |
| part_02_shape_Z_4_2 | 3.304000e+01 | 3.008000e+01 | 2.360000e+01 | 2.23 | 3.798200e+02 | numeric |
| part_02_shape_Z_1_0 | 1.650000e+00 | 4.000000e-01 | 1.630000e+00 | 0.67 | 4.980000e+00 | numeric |
| part_02_shape_Z_4_1 | 2.800000e+01 | 2.626000e+01 | 1.975000e+01 | 0.88 | 3.450400e+02 | numeric |
| part_02_shape_Z_7_2 | 2.887000e+01 | 2.744000e+01 | 1.746000e+01 | 4.53 | 3.775500e+02 | numeric |
| part_02_shape_Z_4_0 | 1.535000e+01 | 1.518000e+01 | 1.055000e+01 | 0.01 | 2.135400e+02 | numeric |
| part_02_density_Z_7_3 | 2.684000e+01 | 2.500000e+01 | 1.618000e+01 | 3.21 | 2.001800e+02 | numeric |
| part_02_density_Z_0_0 | 1.628000e+01 | 1.168000e+01 | 1.374000e+01 | 0.78 | 1.023700e+02 | numeric |
| part_02_density_Z_7_0 | 1.410000e+01 | 1.240000e+01 | 8.520000e+00 | 1.21 | 1.229800e+02 | numeric |
| part_02_density_Z_7_1 | 2.008000e+01 | 1.846000e+01 | 1.143000e+01 | 1.99 | 1.516200e+02 | numeric |
| part_02_density_Z_3_0 | 1.039000e+01 | 9.020000e+00 | 6.870000e+00 | 0.53 | 8.338000e+01 | numeric |
| part_02_density_Z_5_2 | 2.231000e+01 | 1.975000e+01 | 1.507000e+01 | 2.33 | 1.784800e+02 | numeric |
| part_02_density_Z_6_1 | 2.051000e+01 | 2.094000e+01 | 1.312000e+01 | 0.46 | 1.621400e+02 | numeric |
| part_02_density_Z_3_1 | 1.541000e+01 | 1.282000e+01 | 1.119000e+01 | 1.96 | 1.138300e+02 | numeric |
| part_02_density_Z_6_0 | 1.065000e+01 | 1.200000e+01 | 5.880000e+00 | 0.01 | 1.151500e+02 | numeric |
| part_02_density_Z_2_1 | 2.364000e+01 | 1.850000e+01 | 1.891000e+01 | 0.78 | 1.652800e+02 | numeric |
| part_02_density_Z_6_3 | 2.891000e+01 | 2.831000e+01 | 1.914000e+01 | 0.98 | 2.429600e+02 | numeric |
| part_02_density_Z_2_0 | 1.793000e+01 | 1.391000e+01 | 1.461000e+01 | 0.03 | 1.205600e+02 | numeric |
| part_02_density_Z_6_2 | 2.624000e+01 | 2.618000e+01 | 1.718000e+01 | 0.77 | 2.156300e+02 | numeric |
| part_02_density_Z_5_0 | 1.361000e+01 | 1.226000e+01 | 8.200000e+00 | 0.64 | 1.106500e+02 | numeric |
| part_02_density_Z_5_1 | 1.900000e+01 | 1.687000e+01 | 1.257000e+01 | 1.68 | 1.624200e+02 | numeric |
| part_02_density_Z_4_2 | 2.679000e+01 | 2.426000e+01 | 1.968000e+01 | 0.78 | 2.142100e+02 | numeric |
| part_02_density_Z_1_0 | 1.650000e+00 | 4.000000e-01 | 1.620000e+00 | 0.61 | 4.960000e+00 | numeric |
| part_02_density_Z_4_1 | 2.324000e+01 | 2.147000e+01 | 1.712000e+01 | 0.40 | 1.689100e+02 | numeric |
| part_02_density_Z_7_2 | 2.429000e+01 | 2.277000e+01 | 1.425000e+01 | 2.61 | 1.884000e+02 | numeric |
| part_02_density_Z_4_0 | 1.371000e+01 | 1.346000e+01 | 9.920000e+00 | 0.01 | 1.186200e+02 | numeric |
| resolution | 2.150000e+00 | 5.400000e-01 | 2.070000e+00 | 0.48 | 8.200000e+00 | numeric |
| FoFc_mean | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.00 | 0.000000e+00 | numeric |
| FoFc_std | 1.300000e-01 | 5.000000e-02 | 1.200000e-01 | 0.01 | 9.400000e-01 | numeric |
| FoFc_square_std | 2.000000e-02 | 2.000000e-02 | 1.000000e-02 | 0.00 | 8.900000e-01 | numeric |
| FoFc_min | -7.000000e-01 | 3.000000e-01 | -6.600000e-01 | -7.55 | -4.000000e-02 | numeric |
| FoFc_max | 2.600000e+00 | 2.540000e+00 | 1.840000e+00 | 0.04 | 4.526000e+01 | numeric |

| res_name | Examples |
|---|---|
| SO4 | 38757 |
| GOL | 27615 |
| EDO | 21169 |
| NAG | 17941 |
| CL | 15627 |
| CA | 14377 |
| ZN | 13568 |
| MG | 9960 |
| HEM | 7397 |
| PO4 | 7336 |
| ACT | 5430 |
| DMS | 4601 |
| IOD | 4367 |
| PEG | 3455 |
| NAD | 3278 |
| K | 3220 |
| FAD | 3100 |
| MN | 2824 |
| CLA | 2654 |
| ADP | 2574 |
| MLY | 2413 |
| NAP | 2327 |
| CD | 2264 |
| UNX | 2171 |
| MPD | 2165 |
| PG4 | 2098 |
| MAN | 1955 |
| FMT | 1951 |
| MES | 1852 |
| 1PE | 1543 |
| ATP | 1523 |
| CU | 1518 |
| COA | 1497 |
| BR | 1454 |
| FMN | 1439 |
| EPE | 1374 |
| NDP | 1312 |
| PGE | 1261 |
| HEC | 1236 |
| NI | 1167 |
| TRS | 1152 |
| NO3 | 1144 |
| ACY | 1138 |
| SF4 | 1132 |
| FE | 1085 |
| SAH | 1084 |
| PLP | 1067 |
| GDP | 1062 |
| UNK | 1032 |
| C8E | 1020 |
The table shows the correlations between the numeric columns of the dataset, computed with the cor() function. Only pairs whose absolute correlation is greater than 0.6 are displayed.
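The filtering step described above can be sketched in base R as follows. The toy data frame and its column names `a`, `b`, `c` are hypothetical. The long-format column names `X1` and `X2` are taken from the report's `cor_pdb`, while `value` is an assumed name for the coefficient column.

```r
# Toy data standing in for the numeric columns of the real dataset (hypothetical values).
set.seed(1)
x <- rnorm(200)
toy <- data.frame(
  a = x,
  b = x + rnorm(200, sd = 0.1),  # strongly correlated with a
  c = rnorm(200)                 # independent noise
)

# Pearson correlation matrix, reshaped to one row per column pair.
cor_mat  <- cor(toy)
cor_long <- as.data.frame(as.table(cor_mat), stringsAsFactors = FALSE)
names(cor_long) <- c("X1", "X2", "value")  # "value" is an assumed column name

# Drop the diagonal and keep only pairs with |r| > 0.6.
strong <- subset(cor_long, X1 != X2 & abs(value) > 0.6)
print(strong)
```

Each pair appears twice in the result (once in each order) because the correlation matrix is symmetric.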
The mismatch is computed as the total number of rows in which the compared values differ.

| res_name | Mismatch |
|---|---|
| NAG | 17507 |
| MLY | 2395 |
| MAN | 1763 |
| UNK | 1032 |
| PLP | 933 |
| CLA | 847 |
| 1PE | 711 |
| C8E | 564 |
| PG4 | 489 |
| NAP | 214 |

| res_name | Mismatch |
|---|---|
| NAG | 17507 |
| MLY | 2395 |
| MAN | 1763 |
| UNK | 1032 |
| PLP | 933 |
| CLA | 847 |
| 1PE | 711 |
| C8E | 564 |
| PG4 | 489 |
| NAP | 214 |
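A minimal sketch of how such a per-class mismatch count could be obtained, assuming the compared values are `local_res_atom_non_h_count` and `dict_atom_non_h_count`. The toy rows below are hypothetical and only illustrate the counting.

```r
# Hypothetical toy rows; the column names follow the report's dataset.
toy <- data.frame(
  res_name                   = c("NAG", "NAG", "NAG", "SO4", "SO4", "MAN"),
  local_res_atom_non_h_count = c(14, 14, 15, 5, 5, 11),
  dict_atom_non_h_count      = c(15, 15, 15, 5, 5, 12)
)

# TRUE where the modelled atom count differs from the dictionary value.
toy$differs <- toy$local_res_atom_non_h_count != toy$dict_atom_non_h_count

# Number of differing rows per res_name, largest first.
mismatch <- aggregate(differs ~ res_name, data = toy, FUN = sum)
names(mismatch)[2] <- "Mismatch"
mismatch <- mismatch[order(-mismatch$Mismatch), ]
print(head(mismatch, 10))
```

The same pattern would apply to the electron sums by swapping in the two `*_electron_sum` columns.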
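set.seed(123)
# Columns correlated with local_res_atom_non_h_count, excluding the dictionary columns.
# %in% is used because != against a two-element vector recycles it element-wise.
reg_at <- cor_pdb %>%
  filter(X2 == "local_res_atom_non_h_count",
         !X1 %in% c("dict_atom_non_h_count", "dict_atom_non_h_electron_sum"))
reg_at_names <- as.character(reg_at$X1)
regresja_at <- pdb_clear_last %>%
  select(all_of(reg_at_names), local_res_atom_non_h_count)
# 70/30 split of pdb_clear_last; the model below is fit on this full-column split.
idx_at <- createDataPartition(pdb_clear_last$local_res_atom_non_h_count,
                              p = 0.7, list = FALSE)
training_at <- pdb_clear_last[idx_at, ]
testing_at <- pdb_clear_last[-idx_at, ]
# 2-fold cross-validation repeated 5 times.
control <- trainControl(method = "repeatedcv", number = 2, repeats = 5)
fit_at <- train(local_res_atom_non_h_count ~ ., data = training_at,
                method = "glm", metric = "RMSE", trControl = control)
# Evaluate on the held-out 30%.
predAt <- predict(fit_at, newdata = testing_at)
postResample(predAt, testing_at$local_res_atom_non_h_count)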
## RMSE Rsquared MAE
## 0.13184190 0.99990067 0.02184218
The RMSE is close to 0 (about 0.13). The R^2 measure is also satisfactory, with a value close to 1 (about 0.9999).
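For reference, the three measures reported by postResample() can be reproduced by hand. The sketch below uses hypothetical observed and predicted values and assumes the default behaviour in which Rsquared is the squared correlation between predictions and observations.

```r
# Hypothetical observed and predicted values, for illustration only.
obs  <- c(3, 5, 8, 13, 21)
pred <- c(2.5, 5.5, 7, 14, 20)

rmse <- sqrt(mean((pred - obs)^2))  # root mean squared error
mae  <- mean(abs(pred - obs))       # mean absolute error
r2   <- cor(pred, obs)^2            # squared Pearson correlation

print(c(RMSE = rmse, Rsquared = r2, MAE = mae))
```

RMSE and MAE are expressed in the units of the predicted variable, so they are easiest to judge against the spread of that variable, while Rsquared is unitless.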
set.seed(123)
# Columns correlated with local_res_atom_non_h_electron_sum, excluding the
# dictionary columns and the target itself.
reg_el <- cor_pdb %>%
  filter(X2 == "local_res_atom_non_h_electron_sum",
         !X1 %in% c("dict_atom_non_h_count", "dict_atom_non_h_electron_sum",
                    "local_res_atom_non_h_electron_sum"))
reg_el_names <- as.character(reg_el$X1)
# Selected predictors plus the target column.
regresja_el <- pdb_clear_last %>%
  select(all_of(reg_el_names), local_res_atom_non_h_electron_sum)
# 70/30 split; the model is fit on the training rows only.
idx_el <- createDataPartition(pdb_clear_last$local_res_atom_non_h_electron_sum,
                              p = 0.7, list = FALSE)
training_el <- regresja_el[idx_el, ]
testing_el <- pdb_clear_last[-idx_el, ]
# 2-fold cross-validation repeated 5 times.
control <- trainControl(method = "repeatedcv", number = 2, repeats = 5)
fit_el <- train(local_res_atom_non_h_electron_sum ~ ., data = training_el,
                method = "glm", metric = "RMSE", trControl = control)
predEl <- predict(fit_el, newdata = testing_el)
postResample(predEl, testing_el$local_res_atom_non_h_electron_sum)
## RMSE Rsquared MAE
## 12.9018846 0.9791096 8.8296217
The RMSE (about 12.9) is unsatisfactory; a good result could not be obtained for the electron count.
Because of the very long processing time, the dataset was reduced in size and the ntree parameter was limited to 5. Increasing ntree was attempted in order to obtain better results, but already at the default value of 500, with 1000 rows of the dataset, the run time exceeded one hour.
# Drop the atom-count, electron-sum and segment-count columns before classification.
pdb_clear_50_rf <- pdb_clear_50 %>%
  select(-dict_atom_non_h_electron_sum, -dict_atom_non_h_count,
         -part_00_density_segments_count, -part_01_density_segments_count,
         -part_02_density_segments_count,
         -local_res_atom_non_h_electron_sum, -local_res_atom_non_h_count)
# Keep a stratified 10% of the rows to shorten the run time.
ogranicz <- createDataPartition(pdb_clear_50_rf$res_name,
                                p = 0.9, list = FALSE)
pdb_clear_50_rf <- pdb_clear_50_rf[-ogranicz, ]
# 70/30 split of the reduced set, stratified by res_name.
idx_kl <- createDataPartition(pdb_clear_50_rf$res_name,
                              p = 0.7, list = FALSE)
training_kl <- pdb_clear_50_rf[idx_kl, ]
testing_kl <- pdb_clear_50_rf[-idx_kl, ]
# 2-fold cross-validation repeated 5 times.
control <- trainControl(method = "repeatedcv", number = 2, repeats = 5)
set.seed(123)
# Random forest with only 5 trees.
fit_kl <- train(as.factor(res_name) ~ .,
                data = training_kl,
                method = "rf",
                trControl = control,
                ntree = 5)
rfRes <- predict(fit_kl, newdata = testing_kl)
# Compare the predictions with the true classes of the test set.
cm_1 <- confusionMatrix(data = rfRes,
                        reference = factor(testing_kl$res_name))
cm_1$overall
## Accuracy Kappa AccuracyLower AccuracyUpper AccuracyNull
## 0.3403444 0.2878040 0.3296938 0.3511175 0.1527540
## AccuracyPValue McnemarPValue
## 0.0000000 NaN
The result is not satisfactory: the accuracy of about 0.34 exceeds the no-information rate of about 0.15 but remains low. This is due to the limited number of samples and the very small value of the ntree parameter.
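To make the fields of cm_1$overall easier to read, the sketch below recomputes Accuracy, AccuracyNull and Kappa from a small confusion table in base R. The labels are hypothetical toy data, and the formulas follow the standard definitions (AccuracyNull as the share of the largest reference class, Kappa as Cohen's kappa).

```r
# Hypothetical true and predicted classes, for illustration only.
truth <- factor(c("SO4", "SO4", "SO4", "GOL", "GOL", "EDO"),
                levels = c("SO4", "GOL", "EDO"))
pred  <- factor(c("SO4", "SO4", "GOL", "GOL", "EDO", "EDO"),
                levels = levels(truth))

tab <- table(Prediction = pred, Reference = truth)
n   <- sum(tab)

accuracy      <- sum(diag(tab)) / n        # share of correct predictions
accuracy_null <- max(colSums(tab)) / n     # always guessing the largest class
expected      <- sum(rowSums(tab) * colSums(tab)) / n^2  # chance agreement
kappa_stat    <- (accuracy - expected) / (1 - expected)  # Cohen's kappa

print(c(Accuracy = accuracy, AccuracyNull = accuracy_null, Kappa = kappa_stat))
```

Comparing Accuracy against AccuracyNull shows how much a classifier improves on always predicting the most frequent class, which is a useful baseline when classes are as unbalanced as in this dataset.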